{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# <center/>混合精度训练体验"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 概述\n",
    "\n",
    "神经网络训练的时候，数据和权重等各种参数一般使用单精度浮点数（float32）进行计算和存储。在采用复杂神经网络进行训练时，由于计算量的增加，机器的内存开销变得非常大。经常玩模型训练的人知道，内存资源的不足会导致训练的效率变低，简单说就是训练变慢，有没有什么比较好的方法，在不提升硬件资源的基础上加快训练呢？这次我们介绍其中一种方法--混合精度训练，说白了就是将参数取其一半长度进行计算，即使用半精度浮点数（float16）计算，这样就能节省一半内存开销。当然，为了保证模型的精度，不能把所有的计算参数都换成半精度。为了兼顾模型精度和训练效率，MindSpore在框架中设置了一个自动混合精度训练的功能，本次体验我们将使用ResNet-50网络进行训练，体验MindSpore混合精度训练和单精度训练的不同之处。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "整体过程如下："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. MindSpore混合精度训练的原理介绍。\n",
    "2. 数据集准备。\n",
    "3. 定义ResNet-50网络。\n",
    "4. 定义`One_Step_Time`回调函数。\n",
    "5. 定义训练网络（此处设置自动混合精度训练参数`amp_level`）。\n",
    "6. 验证模型精度。\n",
    "7. 混合精度训练和单精度训练的对比。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 本文档适用于GPU和Ascend环境。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## MindSpore混合精度训练原理介绍"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/training/source_zh_cn/advanced_use/images/mix_precision.PNG)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. 参数以FP32存储；\n",
    "2. 正向计算过程中，遇到FP16算子，需要把算子输入和参数从FP32 `cast`成FP16进行计算；\n",
    "3. 将Loss层设置为FP32进行计算；\n",
    "4. 反向计算过程中，首先乘以Loss Scale值，避免反向梯度过小而产生下溢；\n",
    "5. FP16参数参与梯度计算，其结果将被cast回FP32；\n",
    "6. 除以`Loss scale`值，还原被放大的梯度；\n",
    "7. 判断梯度是否存在溢出，如果溢出则跳过更新，否则优化器以FP32对原始参数进行更新。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "从上可以理解(float16为半精度浮点数，float32为单精度浮点数)，MindSpore是将网络中的前向计算部分`cast`成半精度浮点数进行计算，以节省内存空间，提升性能，同时将`loss`值保持单精度浮点数进行计算和存储，`weight`使用半精度浮点数进行计算，单精度浮点数进行保存，通过这样操作即提升了训练效率，又保证了一定的模型精度，达到提升训练性能的目的。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 数据集准备"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下载并解压数据集cifar10到指定位置。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2020-12-16 15:37:40--  https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/datasets/cifar10.zip\n",
      "Resolving proxy-notebook.modelarts-dev-proxy.com (proxy-notebook.modelarts-dev-proxy.com)... 192.168.0.172\n",
      "Connecting to proxy-notebook.modelarts-dev-proxy.com (proxy-notebook.modelarts-dev-proxy.com)|192.168.0.172|:8083... connected.\n",
      "Proxy request sent, awaiting response... 200 OK\n",
      "Length: 166235630 (159M) [application/zip]\n",
      "Saving to: ‘cifar10.zip’\n",
      "\n",
      "cifar10.zip         100%[===================>] 158.53M   154MB/s    in 1.0s    \n",
      "\n",
      "2020-12-16 15:37:41 (154 MB/s) - ‘cifar10.zip’ saved [166235630/166235630]\n",
      "\n",
      "Archive:  cifar10.zip\n",
      "   creating: ./datasets/cifar10/\n",
      "   creating: ./datasets/cifar10/test/\n",
      "  inflating: ./datasets/cifar10/test/test_batch.bin  \n",
      "   creating: ./datasets/cifar10/train/\n",
      "  inflating: ./datasets/cifar10/train/batches.meta.txt  \n",
      "  inflating: ./datasets/cifar10/train/data_batch_1.bin  \n",
      "  inflating: ./datasets/cifar10/train/data_batch_2.bin  \n",
      "  inflating: ./datasets/cifar10/train/data_batch_3.bin  \n",
      "  inflating: ./datasets/cifar10/train/data_batch_4.bin  \n",
      "  inflating: ./datasets/cifar10/train/data_batch_5.bin  \n",
      "./datasets/cifar10\n",
      "├── test\n",
      "│   └── test_batch.bin\n",
      "└── train\n",
      "    ├── batches.meta.txt\n",
      "    ├── data_batch_1.bin\n",
      "    ├── data_batch_2.bin\n",
      "    ├── data_batch_3.bin\n",
      "    ├── data_batch_4.bin\n",
      "    └── data_batch_5.bin\n",
      "\n",
      "2 directories, 7 files\n"
     ]
    }
   ],
   "source": [
    "!wget -N https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/datasets/cifar10.zip\n",
    "!unzip -o cifar10.zip -d ./datasets\n",
    "!tree ./datasets/cifar10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数据增强"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "先将CIFAR-10的原始数据集可视化："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "the cifar dataset size is : 50000\n",
      "the tensor of image is: (32, 32, 3)\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD5CAYAAADhukOtAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfyElEQVR4nO2daYxc55We31N7sxc2e2GzuTXFTaOVlNAjS7LG0nhgj8YzieRkYthBHAURhs5gDMTA5IfgALGTX54gtuAfgRM6FkaTOLLlDZYTzYwlQRNFY4sWtZGUqIWkuDWbvVcvVb3UcvKjigElf+/tFru7mtZ9H4Bg9Xf61P3qu/fUrf7eOueYu0MI8eEnsdYTEEI0BgW7EDFBwS5ETFCwCxETFOxCxAQFuxAxIbUcZzO7F8A3ASQB/Dd3/1rU73d1dXlfX1/QdiUSoJlR25VKihFPiUq1Gh6vlKlPKpXmx0LEwSJIJK7+9+iFhYXgeNTcU6llXY5B2HVQ9fC5BKLPi4P7IeKaSySS3I8wNz9LbdlMLjh+5sxZjI2NBV/AFa+umSUB/GcAnwBwHsCLZvaEu7/BfPr6+vDCoReCtrm5OXostlCpJF/ASqVCbVUStACQSvHnLMwWguPjk+PUp6enh9qswi/8ZMRra2pqojZG1Jtf5PtihDHqzfbChQvB8VwufJECQGdnZ8Q0ogKJr2O5HH4jLs4WqU864k2nVOZ+Uddcc3MrtVUr4evx5CkaSti+7drg+N1330N9lnOLuA3ACXc/5e4LAL4H4L5lPJ8QYhVZTrBvAXDusp/P18eEEFchq/7Hn5kdMLPDZnZ4dHR0tQ8nhCAsJ9gHAGy77Oet9bH34O4H3b3f3fu7urqWcTghxHJYTrC/CGCPmV1jZhkAnwXwxMpMSwix0lzxbry7l83siwD+FjXp7RF3fz3KZ35+Hu+ceidoe/rvnqJ+ba0dwfH2Dv5JoVSZp7bJSf7nxPT0Rf6c5fBuazbHd1o7NmygtoUFLtl1dPCd6fUtLdSWTodPaXc3307JZZupDVV+PxgaOE9t33/sfwTH01muJNz7h/+A2ro38vVoXsfXf2RsJDh+7I1D1GdoKHyNAsDo+BlqKxS4VNa9YSO1rcuF12R8fIz67Nl9c3B8bHyY+ixL2HT3JwE8uZznEEI0hqv/2xlCiBVBwS5ETFCwCxETFOxCxAQFuxAxYeXTjKIwr/0LMJrnMsPJMyeD482t6/mhUvx9bHqKS28TEdJKIpUJjndvvIb6jOT5sVqauYQ2X+XS4YUhnjSUy2WD4x6RddXR0Uttk+Ph5B8A+OXzz1PbG2++FRzv6t1OfY6+9S617Wvi8uDkLJe8zp87Fxx/5903qc+bx7ksl58KJ/gAQKnCk2RQikjMKoUl2JkZ/rraWjcFx1niD6A7uxCxQcEuRExQsAsRExTsQsQEBbsQMaGhu/GJRBLrSNLCpp6t1G96Olz2KZ8foj7ZiISLqN34mZlpasuQhIVyNWJ3vIkrBqkEL+tkzndvkxFv0Qtz+eB4YZqrHesyPJGkWOC78ZVKuM4cwGv5ZUjttJqN77gnUtyvanz9S0TVyEQkL7WsD+90A0Bhga9Hc7Kb2uanJ6mtuJAPGyLq5LW1hMudJRM8pHVnFyImKNiFiAkKdiFigoJdiJigYBciJijYhYgJjU2EAWAWfn9JREgGTMfJRyTPZDO87dLo+CC1FYsR0hvpWpO0U3weEV1TpiKklUSKy0ndEXXtctmwRDV0MZwQAgCo8rUqlfh5yU/mqW1yMnxuKuf4Wt00209tlXKJ2nJNfP6JZNjW1sbrF15/w23U1jMRbl8GAMmIcBo4zRNvhgbCiV5TeV5P7uLFcP2/Uomvk+7sQsQEBbsQMUHBLkRMULALERMU7ELEBAW7EDFhWdKbmZ0GMA2gAqDs7lw7AZBMJNFG6q5t38zbE42NhOt+nTxxnPoMFfLUNhNhm52NqCOGsFSWn+Ato8qkZRQA9G7mtesWZnmWVLnM5cHmdeFWWS1NvAZdKsFr4WXTXOYrz/PX5pXwWk1ORLTempygNgOXMFM5nhHX3h3ODpuZ5xl7pQV+DWzauIv7kVpyAJBIcHkQFq4bWAWXB6/ZdW1wPEOkV2BldPbfdXf1YhbiKkcf44WICcsNdgfwczN7ycwOrMSEhBCrw3I/xt/l7gNmthHAU2b2prs/d/kv1N8EDgDA1m28Go0QYnVZ1p3d3Qfq/w8D+AmAX/tSsbsfdPd+d+/v6uQbDkKI1eWKg93Mms2s9dJjAJ8EcGylJiaEWFmW8zG+B8BPrJbVlQLwP939b6IczAy5dLiFUlOaF4hsbw3LJy25NuozMsTb9OTH89RWKHLZJZEIt65y4/OYi5De5ktc/nHjEsqZQZ45ZuWzwfHu1h3UZ3aMtxlKpzqprTDD14p0+UI6FZGhRtprAUAuHVFIcZCf6+SJt4PjbQW+9pmtm6kt284/nU7PzFBbdzfPlkulw9dP33YuN954w0eC400RbbKuONjd/RSAfVfqL4RoLJLehIgJCnYhYoKCXYiYoGAXIiYo2IWICQ0vOFlBWJOpWlTRwHXB8WSKZ2tNTYd7fAHA+Djv1wUnmhGAbFNYDstkeY+vajWc0QQA5TI/VhPpiQcAHRt2UluRFCkcPP8u9ZkdnaK2ltbt1HZxkPfamymE5bxtnbyP2u4sX4/1rz9PbRN//yy1zb8TliIHJ3jmYDnDM/32/qPPUFvnnXdSW0QdSHR3hNe4Jcuv7+6usDwYKW3yKQghPkwo2IWICQp2IWKCgl2ImKBgFyImNHQ33gGUquHaZAsVXr9rcjq8Wzw3z3dvF0r8faxU5S87meB+69rCu+4tG3qpD6svBgBFsmMNAMkkf229vTypYqE9vNt9cpYnJI6MjFPbxChPdsnwU4ZP3npDcPyWvbupz/WTJ6htYYTvnheSfB2nN4cTeV6Y4s/nY7w92PEf/JDaPr6Fn5fuzVyFYMlh7S1ckWlqCitUFnH96s4uRExQsAsRExTsQsQEBbsQMUHBLkRMULALERMaK725o1IJ12Qrlbh8Uijmw+MRrZqam9upbW4hQjMyXvdrPZHY1uXWU59MhtdVq0TIjaWIOZYisiq6urcRC0+qmG/mNe12kXZdANDVwusGrmshbb62sPkBGBugpuJsRA26OX7OBg8dDo6nB7nc2PXbvNraDKkXBwCv/eKX1PaJ+++nto6ecMuupjSXbVvJ+iYTvM2X7uxCxAQFuxAxQcEuRExQsAsRExTsQsQEBbsQMWFR6c3MHgHwRwCG3f3G+lgHgO8D2AHgNIDPuPvEUg5YrYazubIR7Z82bwl3fz0/xLOTPEJCa2rhUplX+ftfS0u4Nlk2E85AAoBMmmevWcSx3LmEUilHvLZUeB1/ZzfPoOru4zXXMqyPE4ChCPkqPx/Obpxfx9cqOcFlvqFz56ht5Fy47h4AjJ+5GByv5vPUp0jq1gHA73z+X1LbphtupbbmdbwtkxG5LJXmLaoAZguvO7C0O/tfArj3fWMPAXjG3fcAeKb+sxDiKmbRYK/3W3//W/h9AB6tP34UwP0rOy0hxEpzpX+z97j7pc/QF1Hr6CqEuIpZ9gaduztAisEDMLMDZnbYzA6PjY4u93BCiCvkSoN9yMx6AaD+P90hcfeD7t7v7v2dXby3tRBidbnSYH8CwAP1xw8A+OnKTEcIsVosRXp7DMA9ALrM7DyArwD4GoDHzexBAGcA8J44730u5LLhTJ62lnbq174+vCVw0w23UJ+ODbwl0+Agl+zm57jckUqH5ZOFKm81BfDstew6Lq81NfE2PtkUl6+2psJZdltyPINqfCJPbblWfok0b+Kf1NpS4bWaP3+e+pz9+0PUduQ5bjs7nqe2UiUsU86VuUT1kb37qe1j9/9jaqs0c0k3Pz5GbeX5ueB4pnWGH6sczhJ159fbosHu7p8jpt9bzFcIcfWgb9AJERMU7ELEBAW7EDFBwS5ETFCwCxETGlpwMmGGDOlF1drM5aSN7e3B8VTE7C3BM8NSaS5rjY6Es6QAwD0XNkQob5kcl9csEe5hBwA7ezZTW8sML8557OmfBcdHSOYgALS18IysrlS4GCIAbOjm8ubwcFhOGn/9HeqTvxAhiU4WqC0bkQVYKYelqM2dG6nPPQcepLZMV7h3HAAMnuPXzuxceD0AoLklfO2fOs0TSUuTR8PHKfJ10p1diJigYBciJijYhYgJCnYhYoKCXYiYoGAXIiY0VHozM6SJ7NXZxTOGEumwtJKa4LJWOsd7rKXJ8wFAsTBCbV4NH6+1jcs4Xpmktp6IXmm3bQn3lQOA6hzvcXcxHZaaEgvhHnsAMDbKCzbu3NNHbfMF3nNurhDO2KpWIgoiFrmG2dPOC2Y2R2SwJefCMuXmfTdTn/Zd11DbbJGv/UxhmtqmZ7gktlAOr+Px105Sn/Hzb4fnMM1lWd3ZhYgJCnYhYoKCXYiYoGAXIiYo2IWICQ3djQeABEmEaW7iiTCWCu/gp5p4XbVcE0laAVAo8B3yDe28FdLoUDjRYW6GJyzc2Mufb9t6Pv93XnmN2jbv5Ekt19/aHxxvXs/VjnSC1y0rT/Nd5NffOE1thZHwznSVHwrrNvAkk0yEmrCJd6jC1HR4Ht39t1GfdER7sLmIa2e+wmvGTUzzVlmzs2G/vTt4MlRmd9j23cd/SH10ZxciJijYhYgJCnYhYoKCXYiYoGAXIiYo2IWICUtp//QIgD8CMOzuN9bHvgrgTwBcyhr5srs/uZQDmoWTUHKkbVEN8p7ES6fBqzw5ojOijlj3MLcde+EXwfGpEd7aJzvME1pOO5e1+n97P7XtvYEncWSIhDk5zpNdzp7jtd8mhnidvLkZnrjS4uFzljeuk8238hOaSPKu4AsFnpyS3LQjOL7z9z9FfYznV2F+nq9HdYFLb2nj0mFnc3itSuO8VVb3jj3B8VSK37+Xcmf/SwD3BsYfdvf99X9LCnQhxNqxaLC7+3MA+DcChBC/ESznb/YvmtkRM3vEzPjXxIQQVwVXGuzfArALwH4AgwC+zn7RzA6Y2WEzOzwywgtDCCFWlysKdncfcveKu1cBfBsA/aKxux9093537++OaCoghFhdrijYzezyLeZPAzi2MtMRQqwWS5HeHgNwD4AuMzsP4CsA7jGz/QAcwGkAX1juRBIRckcWYbkumeJZY9Uqr++2qZPLa3O9PNNo73XXB8d/Ofw89Rkc4PLJFw58ltp279lGbaVZLv+8dTT8vpuf5DJZYYq3JkKVS2XlIpeaJithv6l5LpMlShEpcUnesqvcw7MA7/wn/zw4vvnGa6nP8NgFajv84ivUdubkaWrbsZXP8dTZcEusY68cpj79d4bPWbHIa9AtGuzu/rnA8HcW8xNCXF3oG3RCxAQFuxAxQcEuRExQsAsRExTsQsSEhhecpPCOTEiSTB6v8EyiDGkzBQAtrbyV0IYIWe7aa3cHx3dGtK7aaFwK2XkNb600cZFnqZ05e5baKrnways6X6t8icty5XkuvTVneZYaa5NUauI+WVJYFACKSf6N7P2/fx+17f3d28PzKPPX/Nf/+6+p7ekn/47aWtZxuXd8kJ+zN46FJbaB8zwbsW1TuEXV/PwC9dGdXYiYoGAXIiYo2IWICQp2IWKCgl2ImKBgFyImXD3SWxQe1uWq4LIQK2wJAOkUT7FrshK1tZbCWV7NGzuoz5tvhTOaAODcs4eorVLi8xge4lXCck3h11ac48+3tW8XtRWdZ8QNj/NCm2kLF/zMtXGZcrzKi47OVluorX0Tzyg7efx4cDw/NUp9zpx8l9q6N3DZdnKSr8f/PXSU2kpzYZmyexMvsrm9L/yaMxkuX+rOLkRMULALERMU7ELEBAW7EDFBwS5ETLhqduPd+c46IxFRlyxV4rvI+VOvU9uzP/sZtZ29GC6FvXdfP/Xp3vdxaisu8DlmM3xnup3ncGDsXHj3/9pmXt9teobPI9Oxkdp6toUTgwDglcPh5I6ho29Tn3xECbrT589R25lTr1JbaT68WKUFfr3lSAstAEhUecuu8bEBasvk+PE2bwuvcWmeL8iGznBiUDLFQ1p3diFigoJdiJigYBciJijYhYgJCnYhYoKCXYiYsJT2T9sA/BWAHtTaPR1092+aWQeA7wPYgVoLqM+4+8SVTiSiBB1NainP8MM9/9PvUdvP/w9v19S5+wZq++N/9U+D472beoPjAFCt8tpvp8+cprZigUs8m/aE648BwB39NwfHp8+coD7t23irqc7e7dTW1MzryV1/x0eD44/9V9rwF/OneausmSneNupXL/A2SW1N7cHxm27aR32KC7y91usneVvDri1t1Laula/VyHT4eIVJfu0cevlI2IfU/gOWdmcvA/hzd78ewO0A/szMrgfwEIBn3H0PgGfqPwshrlIWDXZ3H3T3l+uPpwEcB7AFwH0AHq3/2qMA7l+lOQohVoAP9De7me0AcAuAQwB63P1SrduLqH3MF0JcpSw52M2sBcCPAHzJ3d/zR4bXvusa/D6gmR0ws8NmdnhkJPx1UyHE6rOkYDezNGqB/l13/3F9eMjMeuv2XgDBrgbuftDd+929v7u7eyXmLIS4AhYNdqtthX8HwHF3/8ZlpicAPFB//ACAn6789IQQK8VSst4+CuDzAI6a2av1sS8D+BqAx83sQQBnAHxmOROpJvj7jpXCMtTfPPED6nP8BJdxPv0v/pTabti/n9qq1XBdtdkFnp00NTVJba0tPLuqUuJtfGYm+J9DzZ3hDKr0Bp69tmUXr0FXXgi/ZgCYmZ6mtr6dYcnutrt5FmAy8xK1dfRwCXB6KlwbEAByuVxwfGrmIvUpVHh9uu5reA29qSKXS8+d4s9ZrYTDsDTPMx839obl0lSa+ywa7O7+PLgM/nuL+Qshrg70DTohYoKCXYiYoGAXIiYo2IWICQp2IWLCVVNwMlHlBflK1fA0b7qdiwH3/EOeidbayiWv4vwstc0SGSplvJ1UcxNvW0SUPADA5ATPXpoY53JeuRQWTrLOD5Y+dYraorK1skl++RQmgt+xwraIVlmze3kBy5Onz1LbxGT4WABw9I1wcdGFKj/PbZu4fDU8xltvjeX5OauU+TWSzmSD483tXdSnd2tYisxEFCrVnV2ImKBgFyImKNiFiAkKdiFigoJdiJigYBciJlw10lulwjPHEomwbNG3MyJbK6LQY3GWN0ubK3G/uTKxlagLEs4ll2SiidqyuVZqK0zzAw6cCWdzDZ14jfqsa+JyTV9EccsNLVxWHB0OFwN96wTv9XbyFLdduDBEbW+/fZLakk1hyXHXdTuoT76Qp7bxcV6Mcm6WX8PpFD/Xrevbg+PrOzupTypDQpcUZwV0ZxciNijYhYgJCnYhYoKCXYiYoGAXIiZcNbvxCxW+C85qv5Ui6qNF5JigGrF7Ps83VDG/EE7WSVSimldFqAwRyRjNKf4+nG7jO/WtyfArHzvDd9zPDfCdbn5WgPkir/02OByuufaLFw9Rn8kxntDS0c6Tl3b0baC2SjZ8iZcrETX+IlpveYUnbCUS3JZMcVsuGz7Xbet5ElIqGfaxiEZqurMLERMU7ELEBAW7EDFBwS5ETFCwCxETFOxCxIRFpTcz2wbgr1BryewADrr7N83sqwD+BMClXkRfdvcno57L3bGwENa9xvO8lZATOcEjar8lPEIiiUiSmYuyTYeTIMpzfO5zc1zGuXhhkNrm5yKSKpDmzzn0bnC8OBeRwFHiiUEvvPA8tV0Y4i2Ucolw26XWCGmznOVSUxH8XO/ey5N1rr3l2uD4C8+9QH2SGX4NtLbxeTiplQgASHHps1wJn5tUhp+XHlLLL53m81uKzl4G8Ofu/rKZtQJ4ycyeqtsedvf/tITnEEKsMUvp9TYIYLD+eNrMjgPYstoTE0KsLB/ob3Yz2wHgFgCXvgb1RTM7YmaPmBn/GpMQYs1ZcrCbWQuAHwH4krtPAfgWgF0A9qN25/868TtgZofN7PDoKG9bK4RYXZYU7GaWRi3Qv+vuPwYAdx9y94q7VwF8G8BtIV93P+ju/e7e39XFi94LIVaXRYPdzAzAdwAcd/dvXDZ+ecuVTwM4tvLTE0KsFEvZjf8ogM8DOGpmr9bHvgzgc2a2HzU57jSALyzlgKxE1sQob2lUIRJbJsF1nHKE1GRVnvFUnOcpcSfffic4PjuTpz5jE/x1nb/I/6yZIO2TAGB6mkt961vDdctS2bAUBgBHjr5EbTMFfqybI9o1bW1bHxwfGOeZcnPDeWqbLHC//AQ/Z7OTYb9smvtElNZDKh1u1QQARmolAsDMXESqZSo8x9niSHAcAOaI3MsyRIGl7cY/DwSF7khNXQhxdaFv0AkRExTsQsQEBbsQMUHBLkRMULALERMaWnDS3VEqkay34fPUj2VsLUS06SmT7DoAuDjIZa2Tp09z27vhNkO9Xd3c5zx/vtkSLw64o++3qO22u+6mto985KPB8USCn+qfP/2/qO2VF5+ltmya3ysSvduD43t2h7O1AKD13AC1zRS4hDla4H7HjoVlxXWtPKOsdx3X3goFLtvORci2FeOSWAnh55zK89d1fiCc3bgQkcGoO7sQMUHBLkRMULALERMU7ELEBAW7EDFBwS5ETGh4rzcnhSArRZ4B9su/DefcJEhmFQBcHOXy2rFjb1LbTIS00tQS7rHW0s6LCfbtvYPadu2+jtruupPLa5t6NlJbOhOW81IpLvP1bnmQ2rZu7qW2k+feoLa9190SHO9s5TLl6ZZwViEAHDv+IrUlI3r+JXLh4pHJJuqCTEQhU4uQMBMpLr1FFTKtzIXHPcEz/cpVUsjUI9aCWoQQHyoU7ELEBAW7EDFBwS5ETFCwCxETFOxCxISGSm+zxSKOvPZa0Pbw1x+mfuPDYVku2cqLKE5H1Pdb39lHbftuupHatm8L9xTbuGkT9enqDBeABIC+bVzW2tjDZcX1LbzXWzZLZMAKl2QsxWWhO+4JZ9EBQOoQlymnxi4ExzPGCzZ6it97hsZ4VmQhojBjSypclDThERl7XHkDIoqczjnPOJsr87VKpEhB1aZ11Ke5OZyZl0hGvC5qEUJ8qFCwCxETFOxCxAQFuxAxQcEuRExYdDfezHIAngOQrf/+D939K2Z2DYDvAegE8BKAz7s733JErU9UhewKd/byHfIusgt+3Y23Up+eLTuobWqG7z5v2bqV2jo724Lj2TRPhEmwflcArtnBk0JyOb4lnI7YcU0mwsczsuMLALW+nWEKEfO4btcOajt3Lrx7XpznyR1NTXweu3buoraIkoL4rRs2B8c9yduDVViSCYCk8QyacoWrQxvaw/MAgDOD4Xpyswt8PTKp5uC4Rdy/l3JnnwfwcXffh1p75nvN7HYAfwHgYXffDWACAE+dEkKsOYsGu9e49Hacrv9zAB8H8MP6+KMA7l+NCQohVoal9mdP1ju4DgN4CsBJAHl3v/QNg/MAtqzKDIUQK8KSgt3dK+6+H8BWALcB4EXN34eZHTCzw2Z2OJ/PX9EkhRDL5wPtxrt7HsCzAO4A0G5mlzb4tgIIVrR394Pu3u/u/e3t7cuYqhBiOSwa7GbWbWbt9cdNAD4B4DhqQf/H9V97AMBPV2mOQogVwFhNuP//C2Y3o7YBl0TtzeFxd/8PZrYTNemtA8ArAP6Ze0QmAIB9+/b5k0+G68ldGOCJDpt6w7LFhg6eZFIu84SFKHkil4tI1Iio70V9Imw5lrSyiOcipyyIRUiAUcdaiGij5c4lzGJxNvx8JX5eouZhxm2Dg+GkGwDY3heWUgtFLr0tLExTW2sLl0uzGd42qinHk1ryUxPB8UrEiW7OhaW3u+++G6+8/ErwZC+qs7v7EQC/Vj3Q3U+h9ve7EOI3AH2DToiYoGAXIiYo2IWICQp2IWKCgl2ImLCo9LaiBzMbAXCm/mMXAN7zqXFoHu9F83gvv2nz6HP3oD7Y0GB/z4HNDrt7/5ocXPPQPGI4D32MFyImKNiFiAlrGewH1/DYl6N5vBfN4718aOaxZn+zCyEaiz7GCxET1iTYzexeM3vLzE6Y2UNrMYf6PE6b2VEze9XMDjfwuI+Y2bCZHbtsrMPMnjKzd+r/b1ijeXzVzAbqa/KqmX2qAfPYZmbPmtkbZva6mf3r+nhD1yRiHg1dEzPLmdmvzOy1+jz+fX38GjM7VI+b75tZVNrkr+PuDf2HWqrsSQA7AWQAvAbg+kbPoz6X0wC61uC4HwNwK4Bjl439RwAP1R8/BOAv1mgeXwXwbxq8Hr0Abq0/bgXwNoDrG70mEfNo6JoAMAAt9cdpAIcA3A7gcQCfrY//FwB/+kGedy3u7LcBOOHup7xWevp7AO5bg3msGe7+HIDx9w3fh1rdAKBBBTzJPBqOuw+6+8v1x9OoFUfZggavScQ8GorXWPEir2sR7FsAnLvs57UsVukAfm5mL5nZgTWawyV63H2w/vgigJ41nMsXzexI/WP+qv85cTlmtgO1+gmHsIZr8r55AA1ek9Uo8hr3Dbq73P1WAH8A4M/M7GNrPSGg9s6O6CI3q8m3AOxCrUfAIICvN+rAZtYC4EcAvuTu7ykl08g1Ccyj4WviyyjyyliLYB8AsO2yn2mxytXG3Qfq/w8D+AnWtvLOkJn1AkD9/4g+J6uHuw/VL7QqgG+jQWtitbY0PwLwXXf/cX244WsSmsdarUn92Hl8wCKvjLUI9hcB7KnvLGYAfBbAE42ehJk1m1nrpccAPgngWLTXqvIEaoU7gTUs4HkpuOp8Gg1YE6sVyPsOgOPu/o3LTA1dEzaPRq/JqhV5bdQO4/t2Gz+F2k7nSQD/do3msBM1JeA1AK83ch4AHkPt42AJtb+9HkStZ94zAN4B8DSAjjWax38HcBTAEdSCrbcB87gLtY/oRwC8Wv/3qUavScQ8GromAG5GrYjrEdTeWP7dZdfsrwCcAPADANkP8rz6Bp0QMSHuG3RCxAYFuxAxQcEuRExQsAsRExTsQsQEBbsQMUHBLkRMULALERP+Hzf0UOqfEKGJAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import mindspore.dataset as ds\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "train_path = \"./datasets/cifar10/train\"\n",
    "ds_train = ds.Cifar10Dataset(train_path, num_parallel_workers=8, shuffle=True)\n",
    "print(\"the cifar dataset size is :\", ds_train.get_dataset_size())\n",
    "dict1 = ds_train.create_dict_iterator()\n",
    "dict_data = next(dict1)\n",
    "image = dict_data[\"image\"].asnumpy()\n",
    "print(\"the tensor of image is:\", image.shape)\n",
    "plt.imshow(np.array(image))\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看到CIFAR-10总共包含了50000张32×32的彩色图片。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 定义数据增强函数"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "定义数据集增强函数并将原始数据集进行增强，查看数据集增强后张量数据："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "the cifar dataset size is: 1562\n",
      "the tensor of image is: (32, 3, 224, 224)\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "from mindspore import dtype as mstype\n",
    "import mindspore.dataset as ds\n",
    "import mindspore.dataset.vision.c_transforms as C\n",
    "import mindspore.dataset.transforms.c_transforms as C2\n",
    "\n",
    "def create_dataset(dataset_path, do_train, repeat_num=1, batch_size=32):\n",
    "    \n",
    "    cifar_ds = ds.Cifar10Dataset(dataset_path, num_parallel_workers=8, shuffle=True)\n",
    "    \n",
    "    # define map operations\n",
    "    trans = []\n",
    "    if do_train:\n",
    "        trans += [\n",
    "            C.RandomCrop((32, 32), (4, 4, 4, 4)),\n",
    "            C.RandomHorizontalFlip(prob=0.5)\n",
    "        ]\n",
    "\n",
    "    trans += [\n",
    "        C.Resize((224, 224)),\n",
    "        C.Rescale(1.0 / 255.0, 0.0),\n",
    "        C.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),\n",
    "        C.HWC2CHW()\n",
    "    ]\n",
    "\n",
    "    type_cast_op = C2.TypeCast(mstype.int32)\n",
    "\n",
    "    cifar_ds = cifar_ds.map(operations=type_cast_op, input_columns=\"label\", num_parallel_workers=8)\n",
    "    cifar_ds = cifar_ds.map(operations=trans, input_columns=\"image\", num_parallel_workers=8)\n",
    "\n",
    "    cifar_ds = cifar_ds.batch(batch_size, drop_remainder=True)\n",
    "    cifar_ds = cifar_ds.repeat(repeat_num)\n",
    "\n",
    "    return cifar_ds\n",
    "\n",
    "\n",
    "cifar_ds_train = create_dataset(train_path, do_train=True, repeat_num=1, batch_size=32)\n",
    "print(\"the cifar dataset size is:\", cifar_ds_train.get_dataset_size())\n",
    "dict1 = cifar_ds_train.create_dict_iterator()\n",
    "dict_data = next(dict1)\n",
    "image = dict_data[\"image\"].asnumpy()\n",
    "print(\"the tensor of image is:\", image.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "cifar10通过数据增强后的，变成了一共有1562个batch，张量为(32,3,224,224)的数据集。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 定义深度神经网络"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "本篇使用的MindSpore中的ResNet-50网络模型，下载相关的代码文件。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2020-12-16 15:37:46--  https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/source-codes/resnet.py\n",
      "Resolving proxy-notebook.modelarts-dev-proxy.com (proxy-notebook.modelarts-dev-proxy.com)... 192.168.0.172\n",
      "Connecting to proxy-notebook.modelarts-dev-proxy.com (proxy-notebook.modelarts-dev-proxy.com)|192.168.0.172|:8083... connected.\n",
      "Proxy request sent, awaiting response... 200 OK\n",
      "Length: 9533 (9.3K) [binary/octet-stream]\n",
      "Saving to: ‘resnet.py’\n",
      "\n",
      "resnet.py           100%[===================>]   9.31K  --.-KB/s    in 0s      \n",
      "\n",
      "2020-12-16 15:37:46 (169 MB/s) - ‘resnet.py’ saved [9533/9533]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!wget -N https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/source-codes/resnet.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下载后的文件在notebook的工作目录上，可以导出resnet50网络作为本案例的训练网络。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from resnet import resnet50\n",
    "\n",
    "network = resnet50(batch_size=32, num_classes=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 定义回调函数Time_per_Step来计算单步训练耗时"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`Time_per_Step`用于计算每步训练的时间消耗情况，方便对比混合精度训练和单精度训练的性能区别。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mindspore.train.callback import Callback\n",
    "import time\n",
    "\n",
    "class Time_per_Step(Callback):\n",
    "    def step_begin(self, run_context):\n",
    "        cb_params = run_context.original_args()\n",
    "        cb_params.init_time = time.time()\n",
    "        \n",
    "    def step_end(selfself, run_context):\n",
    "        cb_params = run_context.original_args()\n",
    "        one_step_time = (time.time() - cb_params.init_time) * 1000\n",
    "        print(one_step_time, \"ms\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 定义训练网络"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 设置混合精度训练并执行训练"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "由于MindSpore已经添加了自动混合精度训练功能，我们这里操作起来非常方便，只需要在Model中添加参数`amp_level=O2`就完成了设置GPU模式下的混合精度训练设置。运行时，将会自动混合精度训练模型。\n",
    "\n",
    "`amp_level`的参数详情：\n",
    "\n",
    "`O0`：表示不做任何变化，即单精度训练，系统默认`O0`。\n",
    "\n",
    "`O2`：表示将网络中的参数计算变为float16。适用于GPU环境。\n",
    "\n",
    "`O3`：表示将网络中的参数计算变为float16，同时需要在Model中添加参数`keep_batchnorm_fp32=False`。适用于Ascend环境。\n",
    "\n",
    "在`Model`中设置`amp_level=O2`后即可执行混合精度训练："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch: 1 step: 1, loss is 2.3136287\n",
      "23430.50193786621 ms\n",
      "epoch: 1 step: 2, loss is 2.307624\n",
      "58.97927284240723 ms\n",
      "epoch: 1 step: 3, loss is 2.3061028\n",
      "58.4864616394043 ms\n",
      "... ...\n",
      "epoch: 5 step: 1551, loss is 0.64406776\n",
      "57.350873947143555 ms\n",
      "epoch: 5 step: 1552, loss is 0.71218234\n",
      "57.8157901763916 ms\n",
      "epoch: 5 step: 1553, loss is 0.92978317\n",
      "57.538747787475586 ms\n",
      "epoch: 5 step: 1554, loss is 0.7753498\n",
      "57.379961013793945 ms\n",
      "epoch: 5 step: 1555, loss is 0.37733847\n",
      "57.35301971435547 ms\n",
      "epoch: 5 step: 1556, loss is 0.6176548\n",
      "57.685136795043945 ms\n",
      "epoch: 5 step: 1557, loss is 0.6364769\n",
      "59.25607681274414 ms\n",
      "epoch: 5 step: 1558, loss is 0.9768918\n",
      "58.919668197631836 ms\n",
      "epoch: 5 step: 1559, loss is 0.6273656\n",
      "59.133052825927734 ms\n",
      "epoch: 5 step: 1560, loss is 0.59349436\n",
      "57.468414306640625 ms\n",
      "epoch: 5 step: 1561, loss is 0.87086564\n",
      "57.49654769897461 ms\n",
      "epoch: 5 step: 1562, loss is 0.95843387\n",
      "56.33831024169922 ms\n",
      "Epoch time: 95750.256, per step time: 61.300\n"
     ]
    }
   ],
   "source": [
    "\"\"\"train ResNet-50\"\"\"\n",
    "import os\n",
    "import random\n",
    "import argparse\n",
    "from mindspore import context\n",
    "import mindspore.nn as nn\n",
    "from mindspore import Model\n",
    "from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor\n",
    "from mindspore.nn import SoftmaxCrossEntropyWithLogits\n",
    "\n",
    "\n",
    "if __name__ == '__main__':\n",
    "\n",
    "    context.set_context(mode=context.GRAPH_MODE, device_target=\"GPU\")\n",
    "    \n",
    "    model_path= \"./models/ckpt/mindspore_mixed_precision\"\n",
    "    batch_size = 32\n",
    "    epoch_size = 5\n",
    "    ds_train_path = \"./datasets/cifar10/train\"\n",
    "    \n",
    "    # clean up old run files before in Linux\n",
    "    os.system('rm -f {0}*.ckpt {0}*.meta {0}*.pb'.format(model_path))\n",
    "    # create dataset\n",
    "    train_dataset = create_dataset(dataset_path=ds_train_path, do_train=True, repeat_num=1,\n",
    "                                 batch_size=batch_size)\n",
    "    \n",
    "    # define net\n",
    "    net = network\n",
    "\n",
    "    # define \n",
    "    step_size = train_dataset.get_dataset_size()\n",
    "    lr = 0.01\n",
    "    momentum = 0.9\n",
    "    \n",
    "    # define opt, loss, model\n",
    "    loss = SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')\n",
    "    opt = nn.Momentum(network.trainable_params(), lr, momentum)\n",
    "    model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'},amp_level=\"O2\")\n",
    "    \n",
    "    # define callbacks function\n",
    "    steptime_cb = Time_per_Step()\n",
    "    time_cb = TimeMonitor(data_size=step_size)\n",
    "    loss_cb = LossMonitor()\n",
    "\n",
    "    cb = [time_cb, loss_cb, steptime_cb]\n",
    "\n",
    "    # train model\n",
    "    model.train(epoch_size, train_dataset, callbacks=cb, dataset_sink_mode=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 验证模型精度"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用模型进行精度验证可以得出以下代码。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Accuracy: {'acc': 0.7717347756410257}\n"
     ]
    }
   ],
   "source": [
    "# Eval model\n",
    "eval_dataset_path = \"./datasets/cifar10/test\"\n",
    "eval_data = create_dataset(eval_dataset_path,do_train=False)\n",
    "acc = model.eval(eval_data,dataset_sink_mode=True)\n",
    "print(\"Accuracy:\",acc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 对比不同网络下的混合精度训练和单精度训练的差别"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "由于篇幅原因，我们这里只展示了ResNet-50网络的混合精度训练情况。可以在主程序入口的Model中设置参数`amp_level = O0`进行单精度训练，训练完毕后，将结果进行对比，看看两者的情况，下面将我测试的情况做成表格如下。（训练时，笔者使用的GPU为Nvidia Tesla V100，不同的硬件对训练的效率影响较大，下述表格中的数据仅供参考）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|  网络 | 是否混合训练 | 单步训练时间 | epoch | Accuracy\n",
    "|:------  |:-----| :------- |:--- |:------  \n",
    "|ResNet-50 |  否  | 100ms   |  5 |  0.8128245 \n",
    "|ResNet-50 |  是  | 58ms   |  5 |  0.7717347"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "经过多次测试，使用ResNet-50网络，CIFAR-10数据集，进行混合精度训练对整体的训练效率提升了60%左右，而最终模型的精度有少量降低，对于使用者来说，混合精度训练在提升训练效率上，是一个很好的选择。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "当然，如果你想参考单步训练或者手动设置混合精度训练，可以参考官网教程<https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/enable_mixed_precision.html>。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 总结"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "本次体验我们尝试了在ResNet-50网络中使用混合精度来进行模型训练，并对比了单精度下的训练过程，了解到了混合精度训练的原理和对模型训练的提升效果。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "MindSpore-1.0.1",
   "language": "python",
   "name": "mindspore-1.0.1"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}